Does a Computational Linguist have to be a Linguist?

Author

  • Martin Kay
Abstract

Early computational linguists supplied much of the theoretical basis that the ALPAC report said was needed for research on the practical problem of machine translation. The result of their efforts turned out to be more fundamental in that it provided a general theoretical basis for the study of language use as a process, giving rise eventually to constraint-based grammatical formalisms for syntax, finite-state approaches to morphology and phonology, and a host of models of how speakers might assemble sentences and hearers take them apart. Recently, an entirely new enterprise, based on machine learning and big data, has sprung onto the scene and challenged the ALPAC committee’s finding that linguistic processing must have a firm basis in linguistic theory. In this talk, I will show that the long-term development of linguistic processing requires linguistic theory, sophisticated statistical manipulation of big data, and a third component which is not linguistic at all.

Similar resources

Pieces for a Global Puzzle

My official rhetorical position in this paper, that of an ordinary linguist talking to computational linguists, is rapidly becoming obsolete. In a near future, there will be no non-obsolete ordinary linguists who are not also computational linguists, and no non-obsolete computational linguists who are not also ordinary linguists. So, in anticipation of the near future, I will talk as a linguist...

An Interlingual Lexical Organisation Based on Acceptions From the PARAX mock-up to the NADIA system

Many projects are conducted to develop multilingual lexical databases. Some of these projects use an interlingual approach (KBMT-89, EDR, ...), where others choose a bilingual approach (Multilex, ...). This paper presents an interlingual approach based on acceptions (word-senses) aiming at the development of a multilingual lexical database management system: NADIA. With this approach, the inter...

Incorporating Linguistic Expertise Using ILP for Named Entity Recognition in Data Hungry Indian Languages

Developing linguistically sound and data-compliant rules for named entity annotation is usually an intensive and time-consuming process for any developer or linguist. In this work, we present the use of two Inductive Logic Programming (ILP) techniques to construct rules for extracting instances of various named entity classes, thereby reducing the effort required of a linguist/developer. Using ILP for r...

On De Re Predicates

Thus, we conclude that a modal like ‘think’ is a necessary ingredient for de re/de dicto ambiguity. Roughly put, the ambiguity of (1) can be understood as follows. In the interpretation where ‘the linguist’ refers to Sue, it is evaluated against what we know, so the referent of ‘the linguist’ is who we know as the unique linguist, i.e. Sue. This is the same for the unembedded sentence in (4). O...

Applications of a Computer System for Transformational Grammar

Writing a transformational grammar for even a fragment of a natural language is a task of a high order of complexity. Not only must the individual rules of the grammar perform as intended in isolation, but the rules must work correctly together in order to produce the desired results. The details of grammar-writing are likely to be regarded as secondary by the linguist, who is most concerned wi...



Publication date: 2014